feat(flight-sql): stabilize auth/session lifecycle, enable USE persistence, and harden Flight SQL runtime #17207
Open
CritasWang wants to merge 2 commits into ai-code/flight-sql
…integration test

Fix the "end-of-stream mid-frame" HTTP/2 frame truncation error in the Flight SQL integration tests.

Root cause:
The gRPC default thread pool executor fails to properly handle subsequent RPCs on the same HTTP/2 connection in the DataNode JVM environment, where standalone Netty JARs on the classpath coexist (and conflict) with the grpc-netty bundled in the fat jar.

Fix:
1. directExecutor(): run gRPC handlers in the Netty event loop thread, bypassing the default executor's thread scheduling issues (the key fix)
2. flowControlWindow(1MB): explicit HTTP/2 flow control prevents framing errors when duplicate Netty JARs coexist on the classpath
3. Exclude io.netty from fat jar POM: use standalone Netty JARs already on the DataNode classpath instead of bundling duplicates

Additional bug fixes:
- TsBlockToArrowConverter: fix NPE when getColumnNameIndexMap() returns null for SHOW DATABASES queries (fall back to column indices)
- FlightSqlAuthHandler: add null guards in authenticate() and appendToOutgoingHeaders() for CallHeaders with null internal maps
- FlightSqlAuthHandler: rewrite as CallHeaderAuthenticator with Bearer token reuse and Basic auth fallback
- FlightSqlSessionManager: add user token cache for session reuse
- IoTDBFlightSqlProducer: handle non-query statements (USE, CREATE, etc.) by returning empty FlightInfo, use TicketStatementQuery protobuf format

Test changes:
- Use fully qualified table names (database.table) instead of USE statement to keep each test to one GetFlightInfo + one DoGet RPC per connection
- All 5 integration tests pass: testShowDatabases, testQueryWithAllDataTypes, testQueryWithFilter, testQueryWithAggregation, testEmptyResult
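The "Bearer token reuse and Basic auth fallback" flow named in the FlightSqlAuthHandler item can be sketched with the JDK alone. This is a minimal illustration under stated assumptions, not the PR's code: HeaderAuthSketch, its in-memory password map, and the UUID token format are all hypothetical, and the class does not implement Arrow Flight's CallHeaderAuthenticator interface.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: accept a known Bearer token, otherwise fall back to Basic credentials. */
final class HeaderAuthSketch {
    private static final String BEARER = "Bearer ";
    private static final String BASIC = "Basic ";

    // Stand-in credential store (assumption); the PR validates users through IoTDB instead.
    private final Map<String, String> passwords;
    // Issued token -> username, so later calls can reuse the token.
    private final Map<String, String> tokenToUser = new ConcurrentHashMap<>();

    HeaderAuthSketch(Map<String, String> passwords) {
        this.passwords = passwords;
    }

    /** Returns the bearer token to use for this caller, or throws SecurityException. */
    String authenticate(String authorization) {
        if (authorization == null) {
            throw new SecurityException("Missing Authorization header");
        }
        if (authorization.startsWith(BEARER)) {
            String token = authorization.substring(BEARER.length());
            if (!tokenToUser.containsKey(token)) {
                throw new SecurityException("Unknown bearer token");
            }
            return token; // reuse the existing token
        }
        if (authorization.startsWith(BASIC)) {
            String[] credentials = decodeBasic(authorization.substring(BASIC.length()));
            if (!credentials[1].equals(passwords.get(credentials[0]))) {
                throw new SecurityException("Invalid credentials");
            }
            String token = UUID.randomUUID().toString();
            tokenToUser.put(token, credentials[0]);
            return token;
        }
        throw new SecurityException("Unsupported Authorization scheme");
    }

    /** Returns the username bound to a token, or null if the token is unknown. */
    String userFor(String token) {
        return tokenToUser.get(token);
    }

    // Decodes "base64(user:password)" into {user, password}.
    private static String[] decodeBasic(String encoded) {
        final String decoded;
        try {
            decoded = new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
        } catch (IllegalArgumentException e) {
            throw new SecurityException("Malformed Basic credentials");
        }
        int colon = decoded.indexOf(':');
        if (colon < 0) {
            throw new SecurityException("Malformed Basic credentials");
        }
        return new String[] {decoded.substring(0, colon), decoded.substring(colon + 1)};
    }
}
```

A client would send Basic credentials once, read back the token, and present it as a Bearer header on later calls; unknown tokens are rejected rather than silently re-authenticated.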
22d7342 to 430f9c7
…ning

- Add x-flight-sql-client-id header support for per-client USE database isolation via FlightSqlAuthHandler and ClientIdMiddlewareFactory
- Use \0 (null byte) delimiter in clientSessionCache key to prevent username/clientId collision attacks
- Validate clientId: alphanumeric + dash only, max 64 chars, fail-closed for non-empty invalid values (SecurityException)
- Add maximumSize(1000) to tokenCache and clientSessionCache to prevent resource exhaustion from arbitrary clientIds
- Remove LoginLockManager (userId=-1L caused cross-user lock collision; getUserId() is blocking RPC incompatible with directExecutor())
- Remove unused flightClient field from IT
- Add directExecutor() + HTTP/2 flow control window tuning (1MB) on NettyServerBuilder to fix end-of-stream mid-frame errors
- Document all functional gaps vs SessionManager.login() (password expiration, login lock, checkUser cache-miss risk)

Tests (9/9 pass):
- 5 original Flight SQL query tests
- testUseDbSessionPersistence: USE context persists across connections
- testUseDbWithFullyQualifiedFallback: USE + qualified/unqualified queries
- testUseDbIsolationAcrossClients: Client B fails without USE context
- testInvalidClientIdRejected: non-empty invalid clientId rejected
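The clientId rules in this commit (alphanumeric plus dash, at most 64 characters, fail-closed, \0 as key delimiter) can be illustrated with a short sketch. ClientSessionKeys and cacheKey are hypothetical names chosen for this example; the sketch also assumes usernames never contain a null byte, which the commit message does not state.

```java
import java.util.regex.Pattern;

/** Hypothetical helper showing fail-closed clientId validation and a delimiter-safe cache key. */
final class ClientSessionKeys {
    // Alphanumeric plus dash, 1 to 64 characters, following the rule in the commit message.
    private static final Pattern VALID_CLIENT_ID = Pattern.compile("[A-Za-z0-9-]{1,64}");

    private ClientSessionKeys() {}

    /**
     * Builds the session-cache key for a (username, clientId) pair.
     * A null or empty clientId yields a username-scoped key (the backward-compatible path);
     * a non-empty clientId that fails validation is rejected instead of being ignored.
     */
    static String cacheKey(String username, String clientId) {
        if (clientId == null || clientId.isEmpty()) {
            return username;
        }
        if (!VALID_CLIENT_ID.matcher(clientId).matches()) {
            throw new SecurityException("Invalid client id");
        }
        // '\0' can never appear in a validated clientId, so the split point is unambiguous:
        // ("ab", "c") and ("a", "bc") produce different keys, unlike plain concatenation.
        return username + '\0' + clientId;
    }
}
```

Rejecting invalid values outright, rather than falling back to the username-scoped key, is what keeps a malformed header from silently landing a client in a shared session.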
Background

Compared with the base branch ai-code/flight-sql, this branch addresses several Flight SQL stability and correctness issues in the DataNode environment, among them:
- USE <database> context was not reliably preserved across client reconnections.

What's Changed

1) Auth and session model refactor
- Replaced BasicCallHeaderAuthenticator + GeneratedBearerTokenAuthenticator with a custom CallHeaderAuthenticator (FlightSqlAuthHandler).
- Removed FlightSqlAuthMiddleware; auth is now fully centralized in the header auth flow, which handles both Basic and Bearer.
- FlightSqlSessionManager now:
  - verifies the username and password with AuthorityChecker.checkUser(),
  - keeps tokenCache (token -> session) and clientSessionCache (client-key -> token),
  - builds the client key as username + '\0' + clientId to avoid key ambiguity,
  - validates clientId in fail-closed mode (non-empty values that fail the length or character-set check are rejected),
  - bounds both caches with maximumSize(1000) to avoid unbounded growth.

2) USE <database> persistence and per-client isolation
- Added support for the x-flight-sql-client-id header (in tests via client middleware injection).
- The same clientId reuses the same session, preserving USE context across connections.
- Different clientIds are isolated, preventing cross-client context leakage.
- Added an invalid clientId rejection test to avoid silent fallback to shared sessions.

3) Query execution and resource hardening (IoTDBFlightSqlProducer)
- Changed activeQueries from ConcurrentHashMap to a TTL-based Caffeine cache with cleanup on eviction.
- Non-query statements (USE/CREATE/INSERT/...) now return an empty FlightInfo instead of failing in the query-stream logic.
- Tickets are built with Any.pack(TicketStatementQuery) for Flight SQL dispatch compatibility.

4) Flight SQL service lifecycle/runtime improvements
- Added lifecycle locking to FlightSqlService to prevent start/stop race/reentry issues.
- Tuned the Netty server builder (directExecutor + flow-control window tuning) to reduce end-of-stream mid-frame issues.
- arrow_flight_sql_max_allocator_memory config loading in IoTDBConfig/IoTDBDescriptor.

5) Dependency and packaging updates
- Switched the flight-sql module to arrow-memory-unsafe; excluded conflicting/redundant Netty transitive deps.
- Added flight-sql-jar-with-dependencies into the integration-test assembly.
- Updated the information_schema.services expectation to include FLIGHT_SQL.
- Added arrow-memory-unsafe to dependency management in the pom.

6) Test coverage expansion
Enhanced IoTDBArrowFlightSqlIT with:
- testUseDbSessionPersistence
- testUseDbWithFullyQualifiedFallback
- testUseDbIsolationAcrossClients
- testInvalidClientIdRejected

Validation
- mvn -pl external-service-impl/flight-sql -DskipTests spotless:check
- mvn -pl integration-test -P with-integration-tests -DskipTests spotless:check
- 5/5 passed (completed on this branch).

Known Limitations / Follow-ups
- The full SessionManager.login() path is not used yet. Password-expiration checks and login-lock behavior still need follow-up (async auth or execution model adjustments).
- Requests without a clientId remain backward-compatible (username-scoped session sharing).
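The lifecycle locking added to FlightSqlService (section 4) might look roughly like the following. This is a sketch under assumptions: GuardedLifecycle and its method names are hypothetical, and where the real service presumably starts and stops a Flight server, this example only flips a flag.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch of start/stop guarded by a single lifecycle lock. */
final class GuardedLifecycle {
    private final ReentrantLock lifecycleLock = new ReentrantLock();
    private boolean running; // only read or written while holding lifecycleLock

    /** Starts the service; returns false if it was already running (re-entry is a no-op). */
    boolean start() {
        lifecycleLock.lock();
        try {
            if (running) {
                return false;
            }
            // Real code would build and start the server here, inside the lock.
            running = true;
            return true;
        } finally {
            lifecycleLock.unlock();
        }
    }

    /** Stops the service; returns false if it was not running. */
    boolean stop() {
        lifecycleLock.lock();
        try {
            if (!running) {
                return false;
            }
            // Real code would shut the server down and release resources here.
            running = false;
            return true;
        } finally {
            lifecycleLock.unlock();
        }
    }

    boolean isRunning() {
        lifecycleLock.lock();
        try {
            return running;
        } finally {
            lifecycleLock.unlock();
        }
    }
}
```

Because both transitions check and update the flag under the same lock, a concurrent start and stop are serialized, and a repeated start or stop becomes a harmless no-op.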